
Microbial Genomics

Microbiology Society

Preprints posted in the last 90 days, ranked by how well they match Microbial Genomics's content profile, based on 204 papers previously published here. The average preprint has a 0.11% match score for this journal, so anything above that is already an above-average fit.

1
A stable, hierarchical LIN code system for Campylobacter jejuni and Campylobacter coli: A unified genomic nomenclature for lineage-level typing and global surveillance.

Parfitt, K. M.; Pascoe, B.; Jolley, K. A.; Douglas, A.; Goforth, M. P.; Sheppard, S. K.; Maiden, M. C. J.; Colles, F. M.

2026-02-10 microbiology 10.64898/2026.02.10.705007 medRxiv
Top 0.1%
48.6%

Campylobacter remains the leading cause of bacterial gastroenteritis worldwide, with C. jejuni accounting for around 90% of infections and C. coli accounting for most of the rest. Seven-locus multilocus sequence typing (MLST) has improved our understanding of host association and population structure, whilst core genome MLST (cgMLST) enables investigation of transmission events at high resolution. However, the lack of a stable and standardised nomenclature for clustering of cgMLST data has limited reproducibility and long-term comparability between studies. Here we introduce a joint, hierarchical Life Identification Number (LIN) code system that provides reproducible, multi-level genomic identifiers for C. jejuni and C. coli lineages. Using an updated cgMLST v2 scheme (1,142 loci) and globally representative datasets of high-quality genomes selected from over 53,000 assemblies in the Campylobacter PubMLST database (https://pubmlst.org/organisms/campylobacter-jejunicoli), we first defined LIN codes on a dataset of 5,664 genomes. Pairwise allelic distances were computed using MSTclust, and 18 nested thresholds were defined through silhouette, adjusted Wallace and adjusted Rand Index (ARI) statistics to capture the population structure from species to outbreak-level resolution. The LIN thresholds were then validated using a second dataset of 1,781 genomes from PubMLST and applied to a large water-associated outbreak dataset from New Zealand in 2016, containing clinical and ecological genomes. Further application of LIN codes was demonstrated by analyses of the C. jejuni ST-21 clonal complex and ST-6175 isolates, as well as the broader population structure of C. coli, using data from PubMLST. Across all datasets, LIN clusters were stable, largely monophyletic, and back-compatible with existing nomenclature, accurately distinguishing host-adapted and outbreak-associated lineages.
By embedding cgMLST data within a stable and scalable nomenclature, the Campylobacter LIN system delivers consistent, automated genome-to-lineage assignment. This unified framework bridges population genetics and applied surveillance, enabling robust, real-time comparison of Campylobacter isolates across sources, studies, and time.
Impact statement
Human cases of Campylobacter worldwide continue unabated. Tracing the source of Campylobacter infection is particularly challenging given the sporadic or multi-source nature of outbreaks, with potential transmission from foodborne, animal or environmental sources. Seven-locus MLST has greatly improved our broad understanding of Campylobacter population structure. However, whilst high-resolution cgMLST alleles and STs themselves do not change, longitudinal cluster analyses of cgMLST data have lacked a stable nomenclature, rendering them unsuitable for robust and comparable surveillance over time. Life Identification Number (LIN) codes provide a solution to this problem, establishing an automated and scalable nomenclature derived directly from cgMLST profiles, that is stable over time. We have implemented a joint C. jejuni and C. coli LIN code scheme in PubMLST, with scripts for real-time lineage assignment. LIN codes are back-compatible with existing MLST nomenclature, and we demonstrate their added practical value for exploring population structure and high-resolution outbreak investigation. LIN codes support surveillance of Campylobacter in a One Health context, by enabling consistent typing at multiple levels across different sources, laboratories and time.
Data summary
1. The isolate collections used to develop the LIN codes are publicly available and searchable as individual projects on the PubMLST database (https://pubmlst.org).
   - LIN code development (Dataset 1) (n=5,664 isolates, up to 200 isolates per clonal complex)
   - LIN code validation (Dataset 2) (n=1,781 isolates, up to 50 isolates per clonal complex)
   - Outbreak investigation (Dataset 3): New Zealand 2016 Havelock North waterborne outbreak, Gilpin et al (n=161 isolates) [1]
   - Population structure exploration (clonal complex) (Dataset 4): ST-21 complex (n=1800 isolates, up to 100 isolates randomly selected from each country)
   - Population structure exploration (sequence type) (Dataset 5): ST-6175 (n=321 isolates, genomes with good cgMLST v2 annotation)
2. The software for LIN code development is publicly available as follows:
   - MSTclust for pairwise distance matrices: https://gitlab.pasteur.fr/GIPhy/MSTclust [2]
   - Python script to define LIN codes in a local dataset: https://gitlab.pasteur.fr/BEBP/LINcoding
   - BIGSdb Perl script to define LIN codes from cgMLST profiles on the PubMLST database: https://github.com/kjolley/BIGSdb/blob/develop/scripts/maintenance/lincodes.pl
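The idea of turning pairwise allelic distances and nested thresholds into a stable hierarchical code can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation or the logic of MSTclust, LINcoding or lincodes.pl: the function names, the three-level threshold tuple and the toy four-locus profiles are all hypothetical (the abstract describes 18 thresholds over 1,142 loci), and missing loci are simply skipped.

```python
from typing import List, Optional, Sequence, Tuple

Profile = Sequence[Optional[int]]  # allele numbers per locus; None = locus not called
Code = Tuple[int, ...]


def allelic_distance(a: Profile, b: Profile) -> float:
    """Fraction of loci with different alleles, over loci called in both profiles."""
    shared = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not shared:
        raise ValueError("no loci called in both profiles")
    return sum(x != y for x, y in shared) / len(shared)


def assign_lin(new: Profile,
               coded: List[Tuple[Profile, Code]],
               thresholds: Sequence[float]) -> Code:
    """Assign a hierarchical code to `new` given already-coded genomes.

    `thresholds` are hypothetical distance cut-offs, ordered from the
    broadest level (largest distance) to the finest (smallest distance).
    """
    if not coded:
        return (0,) * len(thresholds)  # the first genome seeds the scheme

    # Find the nearest genome that already has a code.
    dists = [allelic_distance(new, profile) for profile, _ in coded]
    nearest = min(range(len(coded)), key=dists.__getitem__)
    d, ref_code = dists[nearest], coded[nearest][1]

    # First (broadest) level at which the new genome falls outside the threshold.
    pos = next((k for k, t in enumerate(thresholds) if d > t), None)
    if pos is None:
        return ref_code  # within every threshold: same code as the nearest genome

    # Share the prefix, open a new bin at `pos`, and reset the finer levels to 0.
    prefix = ref_code[:pos]
    siblings = [code[pos] for _, code in coded if code[:pos] == prefix]
    return prefix + (max(siblings) + 1,) + (0,) * (len(thresholds) - pos - 1)
```

Because existing codes are never rewritten when a genome is added, earlier assignments stay fixed, which is the stability property the abstract emphasises. The order in which genomes are added can affect the numbers assigned in a scheme of this kind.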

2
Assessment of Oxford Nanopore whole genome sequencing for large-scale genomic characterisation of Staphylococcus aureus

Haugan, I.; Flatby, H. M.; Lysvand, H.; Skei, N. V.; Zaragkoulias, K.; Solligard, E.; Ronning, T. G.; Olsen, L. C.; Damas, J. K.; Afset, J. E.; As, C. G.

2026-04-01 genomics 10.64898/2026.03.30.715209 medRxiv
Top 0.1%
40.7%

Whole-genome sequencing (WGS) is increasingly being utilised in microbial diagnostics, surveillance, and research. In this paper we assess the performance of one leading long-read sequencing technology, Oxford Nanopore Technology (ONT), on 836 Staphylococcus aureus bacteraemia isolates. We compare the results to those of a leading short-read sequencing technology, Illumina. All isolates were sequenced using ONT MinION Mk1B and Illumina HiSeq or MiSeq. Libraries were prepared according to the manufacturers' instructions. Preprocessing and downstream bioinformatic analyses were performed using a combination of in-house pipelines and publicly available software tools. The average base substitution error rate in ONT assemblies was low but varied between sequence types, possibly due to lineage-specific methylation patterns. Multilocus sequence typing was similar between the technologies, while ONT assemblies allowed for better spa typing than Illumina assemblies. The reported detection rate was similar between ONT and Illumina assemblies for most virulence- and AMR-associated genes and variants. For 42 (22.2%) of 189 genes/variants, the two technologies disagreed in gene detection in 5 isolates or more, and in 39 (20.6%) of these the highest detection rate was found with ONT. Discrepancies were mainly associated with low GC content, multiple repetitive segments, and small plasmids. Polishing of ONT data resulted in minor changes in gene/variant calling. Our study supports the use of ONT WGS for bacterial population genomic studies on a large collection of S. aureus isolates. While assembly of ONT reads may be affected by its own methodological limitations, it was superior to Illumina assemblies in detection of potentially clinically relevant genes and variants at a low read error rate. Understanding the advantages and limitations of WGS technologies is essential before undertaking studies involving such methods on large sets of bacteria.
Author summary
In this paper, we present a practical assessment of one important whole genome sequencing (WGS) method, Oxford Nanopore Technology (ONT), and compare its performance in bacterial population genomics to that of WGS with Illumina technology. Our goal was to investigate the usefulness of ONT in studies aiming to identify clinically relevant bacterial characteristics in large collections of bacteria, such as genotype-phenotype studies. We sequenced a large set of clinical S. aureus isolates from episodes of bloodstream infections using both ONT and Illumina technologies and performed analyses with widely used software and bioinformatic pipelines. We have elucidated inherent strengths and limitations of ONT and Illumina sequencing and report some of the practical consequences of these on bacterial typing and detection of clinically relevant genes. With this study, we present one of the most comprehensive assessments of long-read sequencing technology for the genomic characterisation of clinical bacterial isolates, and the findings provide guidance for researchers considering WGS in large-scale bacterial genomics.

3
Towards a holistic epidemiology of Streptococcus agalactiae using the BakRep repository

Fenske, L.; Schwengers, O.; Goesmann, A.

2026-03-03 genomics 10.64898/2026.03.02.709001 medRxiv
Top 0.1%
38.0%

Streptococcus agalactiae is a versatile multi-host pathogen that can cause major neonatal disease in humans, as well as mastitis in dairy animals. Its ability to infect a wide range of hosts is largely driven by its high genomic plasticity and the acquisition of distinct accessory genes. The global population of S. agalactiae is characterized by multiple capsular serotypes and clonal complexes that differ in their propensity to cause invasive disease, including hypervirulent CC17 (often serotype III) associated with neonatal meningitis, whereas CC1/CC19/CC23 are more often colonizing lineages. Although widely studied, most research is limited to particular regions or single outbreak events, offering only fragmented snapshots instead of a comprehensive global picture. To move beyond region- or outbreak-limited studies, this work analyzed 37,970 S. agalactiae genomes from BakRep, integrating serotypes, MLST, AMR genes, lineage-specific genes, and descriptive metadata to map current trends and identify potential gaps in public data. The dataset largely matched the known population structure, with serotypes III, Ia and V most common and stable serotype/clonal complex lineages (e.g. III-2/CC17, Ia/CC23, CC1/V), while also showing rising serotype diversity. Lineages differed in their accessory-gene profiles, with III-2/CC17 being enriched for virulence and adhesion genes, while other groups showed either greater genomic plasticity (mobile/phage genes) or niche specialization. AMR was widespread, with very high tetracycline resistance (>80%), frequent MLSB resistance determinants, and emerging aminoglycoside resistance in some genomes. Overall, however, it became evident that the associated metadata contained substantial gaps. Missing or incomplete information limits biological interpretation, underscoring that rigorously curated, structured metadata is essential for maximizing the value of ongoing sequencing efforts.

4
Reference-free clustering as an epidemiological tool for Mycobacterium tuberculosis lineage typing

Chilengue, A. F.; Whiley, D. J.; Cox, K.; Domingo-Sananes, M. R.; Meehan, C. J.

2026-02-06 bioinformatics 10.64898/2026.02.05.703994 medRxiv
Top 0.1%
37.5%

Whole-genome sequencing (WGS) of Mycobacterium tuberculosis (Mtb) is widely used in the epidemiological investigation of recent transmission events, resulting in high-resolution strain typing. Accurate and rapid strain typing is essential for informing outbreak investigations and guiding tuberculosis control strategies. However, the gold-standard reference-guided SNP-calling pipeline currently used for strain typing relies on computationally intensive reference-mapping approaches, making it challenging to perform in many high-burden, resource-limited settings, where simplified and scalable genomic tools are urgently needed. To address these limitations, we explored reference-free methods for medium-resolution epidemiology, namely Mtb strain (lineage) typing, using a dataset of 535 complete genomes spanning the human- and animal-adapted lineages. Illumina paired-end reads were simulated from each complete genome, assembled, and analysed using three reference-free, k-mer-based tools: MASH, PopPUNK, and SKA2 (Split K-mer Analysis). Genetic distances were generated for each method and compared with a ground truth lineage assignment from TB Profiler. Our results demonstrated that reference-free methods can effectively distinguish Mtb lineages, with SKA2 showing the most promising performance across all datasets. SKA2 consistently recovered lineage and sub-lineage structure with high accuracy, demonstrating strong potential as an alternative to traditional WGS workflows. These findings highlight the utility of reference-free methods, particularly SKA2, for enabling accessible, scalable, and rapid Mtb strain typing, while supporting genomic epidemiology with low computational resources.
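As a rough illustration of how a k-mer-based, reference-free distance works, the sketch below computes the exact Jaccard index between the k-mer sets of two sequences and converts it with the Mash distance formula, D = -(1/k) ln(2j / (1 + j)). It is a simplification and not the pipeline used in the preprint: Mash itself estimates the Jaccard index from MinHash sketches and uses canonical (strand-independent) k-mers, whereas this toy version does neither, and the function names are hypothetical.

```python
import math
from typing import Set


def kmers(seq: str, k: int) -> Set[str]:
    """All overlapping substrings of length k (forward strand only)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def jaccard(a: str, b: str, k: int) -> float:
    """Exact Jaccard index of the two k-mer sets."""
    ka, kb = kmers(a, k), kmers(b, k)
    union = ka | kb
    return len(ka & kb) / len(union) if union else 1.0


def mash_distance(j: float, k: int) -> float:
    """Mash distance from a Jaccard index j; defined as 1.0 when j is 0."""
    if j <= 0:
        return 1.0
    return max(0.0, -math.log(2 * j / (1 + j)) / k)
```

A pairwise matrix of such distances can then be clustered and the clusters compared against known lineage labels, which is the kind of comparison the abstract describes.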

5
Epidemiology of Legionella: Genome-bAsed Typing (el_gato) - a new bioinformatic tool for identifying sequence-based types of Legionella pneumophila from whole genome sequencing data

Collins, A. J.; Mashruwala, D.; Chivukula, V.; Kozak-Muiznieks, N. A.; Rishishwar, L.; Norris, E. T.; Willby, M. J.; Hamlin, J.; Overholt, W. A.

2026-03-23 bioinformatics 10.64898/2026.03.20.713011 medRxiv
Top 0.1%
37.5%

Sequence-based typing (SBT) via Sanger sequencing has been the standard for describing Legionella pneumophila relationships for two decades. SBT involves sequencing seven loci, identifying alleles using the United Kingdom Health Security Agency (UKHSA) database, and inferring the corresponding sequence type (ST). While similar SBT approaches for other organisms can be easily adapted to whole genome sequencing (WGS), L. pneumophila presents two known challenges for this adaptation: multiple copies of one locus (mompS) and extensive heterogeneity in a second locus (neuA/neuAh). Although several computational methods have been proposed to address these issues, a WGS-based replacement with equal resolution to traditional SBT has been elusive. To address this gap, we developed el_gato (Epidemiology of Legionella: Genome-bAsed Typing; https://github.com/CDCgov/el_gato), which offers several advantages over existing methods: (1) a novel approach for resolving multiple mompS alleles identified in the same isolate, (2) the ability to capture diverse neuA/neuAh alleles, (3) fast runtime with an average of 27.7 seconds per sample, (4) easy installation via Bioconda or Docker and (5) an updated database as of March 2025. el_gato works with either paired-end short reads or genome assemblies, performing more accurately with paired-end short reads at least 250 base pairs (bp) in length. We compared el_gato against two other in silico SBT tools ("mompS", hereafter referred to as mompS tool and "legsta") using a dataset of 441 isolates with sequence types (STs) previously determined by Sanger-based sequencing. el_gato correctly identified the ST for 98.9% of the test isolates, compared to 95.2% for the mompS tool and 42.2% for legsta, demonstrating a significant improvement compared to the mompS tool (adjusted p = 1.06e-3) and legsta (adjusted p = 4.24e-55) in ST identification. 
Furthermore, el_gato's determination of ST was not significantly different from Sanger sequencing (adjusted p = 0.442). In summary, el_gato significantly improves in silico SBT and, given its growing adoption, is poised to support the public health community.
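The final step of sequence-based typing, inferring an ST from the seven allele numbers, amounts to a table lookup. The sketch below shows that step only and is not el_gato's code or interface. The locus order follows the standard L. pneumophila SBT scheme (only mompS and neuA/neuAh are named in the abstract), while the profile table, the ST labels and the function name are hypothetical placeholders rather than entries from the UKHSA database.

```python
from typing import Dict, Tuple

# Standard L. pneumophila SBT loci, in scheme order.
LOCI: Tuple[str, ...] = ("flaA", "pilE", "asd", "mip", "mompS", "proA", "neuA")

# Hypothetical profile table: allele numbers per locus -> ST label.
# Real profiles would come from the curated SBT database.
ST_TABLE: Dict[Tuple[int, ...], str] = {
    (1, 2, 3, 4, 5, 6, 7): "ST-example-1",
    (2, 2, 3, 4, 9, 6, 7): "ST-example-2",
}


def infer_st(alleles: Dict[str, int]) -> str:
    """Return the ST for a complete seven-locus allele profile, or 'novel'."""
    missing = [locus for locus in LOCI if locus not in alleles]
    if missing:
        raise KeyError(f"missing allele calls for: {', '.join(missing)}")
    profile = tuple(alleles[locus] for locus in LOCI)
    return ST_TABLE.get(profile, "novel")
```

The lookup itself is trivial. The hard part, as the abstract notes, is calling the right allele when mompS is present in multiple copies or when neuA/neuAh is highly divergent.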

6
Salmonella Genomic Markers for Risk to Food Safety

Waters, E. V.; Hill, C.; Orzechowska, B.; Cook, R.; Jorgensen, F.; Chattaway, M. A.; Langridge, G. C.

2026-03-30 genomics 10.64898/2026.03.27.714810 medRxiv
Top 0.1%
33.6%

Foodborne non-typhoidal Salmonella remains a major public health concern, yet routine surveillance recovers large numbers of isolates from food that are not associated with human illness. Studies have shown foodborne isolates can be genetically linked to clinical cases, highlighting a critical challenge for risk assessment and outbreak prioritisation. This study aimed to determine whether genomic markers can distinguish foodborne Salmonella strains with an increased likelihood of causing infection. Whole-genome sequencing data from over 900 Salmonella isolates recovered from food and the environment through UK Health Security Agency surveillance were analysed using hierarchical clustering to define genetically related groups. These clusters were expanded using the global EnteroBase database to provide broader epidemiological context. Genome-wide association analyses identified genetic markers associated with clusters containing clinical isolates, including phage-associated regions. A highly conserved 7 kb marker identified in S. Agona demonstrated strong predictive performance at a global scale, with high sensitivity and specificity for infection-associated lineages and strict serovar restriction. Comparative genomic analysis revealed that all markers localised to a shared chromosomal hotspot corresponding to a prophage integration site. The 7 kb risk-associated marker formed part of a larger prophage closely related to the well-characterised S. Typhimurium Fels-2 phage, which encodes a DNA invertase linked to phase variation, a mechanism known to promote phenotypic heterogeneity and host adaptation. As these S. Agona isolates are monophasic, our findings indicate that our genome-wide association approach has rediscovered this DNA invertase known to contribute to infection risk but in a different serovar via an alternative regulatory mechanism. 
Overall, this work demonstrates the potential to move beyond treating all foodborne Salmonella isolates as equivalent hazards, towards a genomics-informed framework for risk stratification. This approach provides a foundation for improved risk-based decision-making, and could enhance outbreak investigations and enable earlier prioritisation of public health responses during Salmonella surveillance and control.
Author summary
Foodborne Salmonella infections remain a major public health concern, but not all strains pose the same risk to human health. Here we investigated whether genetic differences could explain why some foodborne strains are more likely to cause human infection. We analysed over 900 genomes from food and environmental sources, grouping closely related strains before placing them in a global context using EnteroBase. By combining pangenome and genome-wide association analyses, we identified distinct lineages within several serovars that differed in their association with human cases. In Salmonella Agona, all clinical isolates belonged to a single lineage carrying a highly conserved 7 kb marker that was absent from low-risk strains. This marker demonstrated strong sensitivity and specificity across global datasets and was located within a prophage closely related to the well-characterised Fels-2 phage. This region encodes a DNA invertase previously linked to phase variation, a mechanism that promotes bacterial adaptability. Our findings indicate that infection risk can be structured at the lineage level and influenced by mobile genomic elements, particularly prophages, that enhance environmental persistence and host adaptation. This work advances genomic surveillance from retrospective linkage towards mechanistic and predictive risk assessment, with direct relevance for supporting risk-based decision-making during outbreak investigations.

7
Mobile element-mediated carbapenem resistance in Enterobacter hormaechei in a Nigerian intensive care unit

Mba, I. E.; Odih, E. E.; Adekanmbi, O.; Oaikhena, A. O.; Sunmonu, G. T.; Adebiyi, I.; Gbaja, A. T.; Animashaun, O.; Osadebamwen, P.; Idowu, O.; Aanensen, D. M.; Okeke, I. N.

2026-04-10 microbiology 10.64898/2026.04.09.712135 medRxiv
Top 0.1%
32.7%

Carbapenem-resistant Gram-negative bacteria pose a critical public health threat. The role of mobile genetic elements in driving their transmission and persistence remains poorly defined. In 2022, we investigated a suspected outbreak of carbapenem-resistant Acinetobacter baumannii (CRAB) in a Nigerian adult intensive care unit (ICU), using short-read whole genome sequencing (WGS) of carbapenem-resistant clinical and environmental isolates during the cluster period. Mobile element dynamics were then inferred from hybrid assemblies of Illumina and Oxford Nanopore reads. The suspected CRAB outbreak was ruled out by WGS, but a carbapenem-resistant Enterobacter hormaechei ST114 bloodstream isolate was found to be indistinguishable from two environmental isolates, all recovered during the Acinetobacter surge. Hybrid assemblies revealed a strikingly conserved ~19 kb resistance island shared across all ST114 genomes. The island contained a blaNDM-5 cassette alongside many other antimicrobial resistance genes, within class 1 integrons and flanked by insertion sequences, located on a 46,176 bp plasmid. Using the hybrid assembly of the ST114 plasmid as a scaffold, the same plasmid was identified in the genome of a Klebsiella pneumoniae ST15 isolate from the ICU environment during the same period. Additionally, re-interrogation of genomic surveillance data uncovered four clonal 2020 ST109 Enterobacter bloodstream isolates from the same facility that carried the resistance genes in the same context on a large 267,242 bp plasmid. Carbapenem resistance in hospital Enterobacterales is driven by both clonal expansion and horizontal spread of mobile resistance elements. These findings underscore the need to track mobile elements alongside bacterial lineages to inform evidence-based infection control, especially in low-resource settings.
Impact Statement
Carbapenem resistance among Enterobacterales remains a major public health threat, yet how mobile genetic elements contribute to their persistence and spread in hospital settings is still poorly understood. In this study, we investigated a suspected outbreak of carbapenem-resistant Acinetobacter baumannii in an adult intensive care unit in Nigeria. Although the outbreak was eventually ruled out, genomic analysis has shown the importance of careful interpretation of suspected outbreak cases in hospital settings. Our findings highlight the importance of close monitoring of ICU environments, the implementation of blood culture-based diagnostics, and the value of genomic support in outbreak investigations. These findings demonstrate that carbapenem resistance in hospital Enterobacterales is driven not only by clonal expansion but also by the horizontal dissemination of a highly stable blaNDM-5-associated MDR island capable of integrating into diverse plasmid backbones. This study emphasizes the need for genomic surveillance that tracks both mobile elements and bacterial lineages to strengthen outbreak investigations, especially in low-resource settings. It further underscores the links between clinical and environmental AMR reservoirs and reinforces the value of a One Health approach to controlling carbapenem resistance.
Data summary
FASTQ sequences were deposited in the NCBI BioSample database under accession numbers SAMN55915584 - SAMN55915597.

8
Molecular analysis of Lancefield group C/G streptococci causing human infections in Sheffield, UK.

Bah, S. Y.; Khalid, H.; Jabang, S.; Chaudhuri, R.; Tilley, L.; Green, L. R.; Partridge, D.; de Silva, T. I.; Turner, C. E.

2026-01-30 microbiology 10.64898/2026.01.30.702767 medRxiv
Top 0.1%
32.3%

Lancefield group C/G streptococci (GCS/GGS) are increasingly recognised as significant human pathogens that cause a disease spectrum similar to Streptococcus pyogenes. Despite their high clinical burden in the UK, their genomic diversity remains poorly understood. We performed whole-genome sequencing (WGS) on a prospective collection of 109 consecutive GCS/GGS isolates from all infection types in Sheffield, UK, over five months in 2020. Streptococcus dysgalactiae subsp. equisimilis (SDSE) accounted for 104 isolates, while five were identified as Streptococcus canis. The SDSE population was highly diverse, comprising 15 genomic clusters and 38 unique emm-ST combinations. We identified the presence of the ST20/stG62647 international lineage (24% of isolates), a cluster globally associated with severe invasive disease. Antimicrobial resistance genes were prevalent (49%), predominantly linked to mobile genetic elements carrying tetracycline and macrolide resistance. Furthermore, a variation in the penicillin-binding protein PBP2X (P601L) was linked to reduced penicillin sensitivity (MIC 0.03 mg/L). There were few or no genetic changes in isolates obtained from the same patient, even when they were collected 8-10 weeks apart, indicating long-term persistence within a host. The unexpected detection of S. canis in human infections and the high diversity of SDSE, persistence and virulence-associated regions underscore the need for enhanced national genomic surveillance to track emerging virulent and antibiotic-resistant SDSE lineages.
Impact statement
Lancefield group C and G streptococci, most often the species Streptococcus dysgalactiae subsp. equisimilis (SDSE), are an increasingly significant human pathogen, often mirroring the severity of infections caused by Lancefield group A Streptococcus (S. pyogenes). Despite its clinical importance, we know little about the population of SDSE circulating in the UK.
This study provides the first comprehensive genomic analysis of SDSE isolates from a single UK region, identifying a highly diverse population comprising 15 distinct genomic clusters but with evidence of long-term persistence within a single host. Notably, we confirm the presence of the international stG62647/ST20 lineage in the UK, which is globally associated with severe invasive disease. Our findings also reveal a high prevalence of antimicrobial resistance genes (~49%), primarily linked to mobile genetic elements, and the presence of a specific variation in the penicillin-binding protein PBP2X that reduces penicillin sensitivity. Additionally, the unexpected detection of S. canis in human infections rather than animals highlights a need for monitoring. By defining the UK's SDSE population structure and its resistance landscape, this research underscores the critical need for enhanced national genomic surveillance to track emerging high-virulence and antibiotic-resistant lineages.
Data summary
Sequence files for isolates from Sheffield used for this study have been uploaded to the sequence read archive with project accession number PRJNA1333937 and accession numbers provided in Supplementary Dataset 1. The completed genome for SDE096 has been deposited on GenBank with the accession number JBSXMJ000000000.

9
Tn3-derived inverted-repeat miniature elements (TIMEs) that mobilize antibiotic resistance genes

Gomi, R.; Yano, H.

2026-02-25 microbiology 10.1101/2025.11.05.686661 medRxiv
Top 0.1%
26.7%

Miniature inverted-repeat transposable elements (MITEs) are nonautonomous mobile genetic elements (MGEs) that can be mobilized by transposases provided by the relevant autonomous MGEs. MITEs originating from Tn3-family transposons were previously termed Tn3-derived inverted-repeat miniature elements (TIMEs). Composite transposon-like structures bounded by two copies of TIME, called TIME-COMPs, were shown to mobilize the intervening sequences. However, their association with antibiotic resistance genes (ARGs) has not yet been systematically studied. This study thus aimed to identify new TIME-COMP-like structures containing ARGs in the genomic sequences of the clinically important bacterial family Enterobacteriaceae in public databases. TIME-COMP-like structures were first searched for in the plasmid database PLSDB, focusing on small plasmids, using a self-against-self blastn approach to identify repeated elements. Then, newly and previously identified MITEs (including TIMEs) were searched for in the NCBI core nucleotide database to identify TIME-COMP-like structures located on other replicons. Bioinformatic analysis identified multiple previously unreported TIME-COMPs containing ARGs, which are bounded by directly or inversely oriented TIMEs, namely, IS101, MITESen1, and a novel 244-bp TIME termed TIME244. TIME244 contains a putative resolution site related to that of Tn21. These TIMEs were predominantly detected in plasmids and very rarely in chromosomes. The ARGs embedded in newly identified TIME-COMPs were blaKPC-2, floR, qnrS1, and tet(A). Notably, the blaKPC-2 carbapenemase gene was found in TIME-COMPs bounded by TIME244 and a TIME-COMP bounded by IS101. These findings highlight a potential role for TIMEs in the spread of diverse ARGs. 
IMPACT STATEMENT
Bacterial miniature inverted-repeat transposable elements (MITEs) are a group of short (50 bp-500 bp) nonautonomous transposable elements that are thought to have originated from insertion sequences or transposons. Although MITEs can theoretically mobilize antibiotic resistance genes (ARGs) in the presence of transposases, only a few studies have reported their association with ARGs, probably due to difficulties in identifying MITEs in genomic sequences. This study provides evidence, based on bioinformatic analysis of public Enterobacteriaceae genomes, that a subset of MITEs, called Tn3-derived inverted-repeat miniature elements (TIMEs), mobilizes ARGs by forming composite transposon-like structures. A novel 244-bp TIME, designated TIME244, was present in more than 100 Enterobacteriaceae plasmids in the current RefSeq database, suggesting its further transmission in bacterial populations through horizontal gene transfer. This study reveals that TIMEs were often overlooked when analyzing the genetic contexts of ARGs in previous studies. These findings highlight the importance of TIMEs in bacterial gene acquisition and underscore the need for new tools that can detect TIMEs in bacterial genomes for ARG surveillance.
DATA SUMMARY
Accession numbers of sequence data analyzed in this study are provided within the article or in supplementary data files.

10
A mobile ESX type VII secretion system enhances intracellular persistence in globally distributed Mycobacterium abscessus

Ferrell, K. C.; Buultjens, A. H.; Warner, S.; Alca, S.; Bustamante, A.; Sim, E.; Martinez, E.; Sintchenko, V.; Counoupas, C.; Stinear, T. P.; Triccas, J.

2026-01-26 microbiology 10.64898/2026.01.26.701661 medRxiv
Top 0.1%
25.7%

Mycobacterium abscessus are non-tuberculous mycobacteria that are widespread in the environment and of increasing global clinical significance. Accumulating evidence shows that M. abscessus has emerged as an important pathogen, driven by highly drug-resistant lineages, enhanced transmissibility and the acquisition of specific virulence factors. In this study, we describe a previously uncharacterised ESX secretion system encoded on a 123-kbp plasmid identified in a clinical isolate of M. abscessus. This ESX system, termed ESX-pMA07, is distinct from ESX systems previously reported in M. abscessus in both sequence composition and locus organisation, characterised by a unique arrangement of core ESX components and low sequence identity to ESX-3, ESX-4 and plasmid-borne ESX-P systems. ESX-pMA07 was detected in geographically diverse clinical isolates but was restricted to particular genotypes within the global M. abscessus phylogeny. Transcriptional profiling revealed expression of ESX-pMA07 components in artificial cystic fibrosis media and during intracellular growth in macrophage cell lines. Using CRISPR interference, we show that inducible silencing of eccC, encoding the ATPase component of ESX-pMA07, significantly reduced intracellular survival of M. abscessus within macrophages. To our knowledge, this is the first characterisation of a functional, plasmid-borne ESX secretion system in M. abscessus, demonstrating that mobile genetic elements contribute to the pathogen's intracellular persistence and may influence its evolving virulence.
Author Summary
Mycobacterium abscessus is a rapidly emerging, highly drug-resistant bacterium that causes chronic infections, particularly in people with underlying lung disease such as cystic fibrosis. The factors that enable certain M. abscessus strains to persist inside host cells are not fully understood.
In this study, we identified a previously unrecognised type VII secretion system (ESX) encoded on a large plasmid in a clinical isolate of M. abscessus. This plasmid-borne ESX system, which we termed ESX-pMA07, is genetically distinct from the ESX systems normally found on the chromosome and was detected in geographically diverse clinical isolates, but restricted to specific lineages within the global M. abscessus population. We show that ESX-pMA07 genes are expressed under conditions relevant to lung infection and during intracellular growth in macrophages. Using inducible CRISPR interference to silence the ESX ATPase gene eccC, we demonstrate that ESX-pMA07 contributes to intracellular survival of M. abscessus in macrophages. These findings reveal that mobile genetic elements can encode functional secretion systems that enhance intracellular persistence, providing a mechanism for the emergence and spread of virulence traits in this important pathogen.

11
Transfer potential of F-like plasmids in Escherichia coli differs by animal environment

Sundar, S.; Bonhoeffer, S.; Huisman, J. S.

2026-02-03 microbiology 10.64898/2026.02.02.703319 medRxiv
Top 0.1%
25.5%

Plasmids play a key role in the spread of virulence and antimicrobial resistance genes to new genetic backgrounds. Genetic variation in the transfer operon, the genes responsible for conjugation, can lead to substantial differences in transfer potential even between closely related plasmids. However, it is not clear how much genetic diversity there is in transfer operons of natural bacterial populations. Here, we analyze the prevalence and transfer potential of F-like plasmids, a clinically important family of plasmids in Enterobacteriaceae. Using 1200 Escherichia coli genomes isolated from three livestock-associated environments, we find that the fraction of F-like transfer operons that are functionally complete was significantly higher in poultry-associated than in bovine- and swine-associated bacteria. This difference was not captured by methods that use the presence of replication genes to estimate plasmid prevalence. Confounders such as the phylogenetic relatedness of E. coli or the presence of antibiotic resistance could not explain these significant differences in transfer potential. Instead, it seems the poultry environment selects for plasmids with high transfer potential, as it also contained more conjugative plasmid types per isolate. While we find environment-specific differences in overall plasmid frequency, patterns of transfer gene presence/absence were similar across the three environments. Regulatory and exclusion genes are the exception to this pattern, suggesting environment-specific modulation of transfer rates. This highlights the use of genomic data to uncover environment-specific differences in plasmid prevalence and transfer potential, revealing the selection pressures shaping horizontal gene transfer in these environments.

12
Genomic analysis of Klebsiella pneumoniae causing community-acquired respiratory deaths among Zambian infants and children using targeted RNA-probe hybridization-capture metagenomics

Lindstedt, K.; Wheelock, A.; Samutela, M.; Kabir, W.; Chasaya, M.; Namuziya, N.; Marsden, E. J.; Kapasa, M.; Mumba, C.; Mulenga, B.; Nkole, L.; Pieciak, R.; Mudenda, V.; Chikoti, C.; Ngoma, B.; Chimoga, C.; Chirwa, S.; Pemba, L.; Nzara, D.; Lungu, J. T.; Forman, L.; Simulundu, E.; MacLeod, W.; Moyo, C.; Somwe, S. W.; Holt, K. E.; Sundsfjord, A.; Gill, C. J.

2026-02-05 microbiology 10.64898/2026.02.02.703236 medRxiv
Top 0.1%
23.1%

Klebsiella pneumoniae (Kp) is a leading cause of neonatal and infant deaths in sub-Saharan Africa and is frequently associated with antimicrobial resistance. Previously, we identified Kp as a major cause of fatal community-associated lower respiratory infections among infants and children under five years in Lusaka, Zambia, using postmortem tissue sampling and pathogen-specific multiplex qPCR. In this follow-up study, we employed a novel culture-independent RNA-probe hybridization-capture metagenomic sequencing approach, targeting Kp pan-genome core and accessory genes, to perform in-depth genomic analysis of Kp from eleven post-mortem lung biopsy samples from seven of these children. Analysis detected Kp in all cases except one, in which Klebsiella quasipneumoniae subspecies similipneumoniae was identified. Core-genome multi-locus sequence typing (cgMLST) revealed six clonal groups (CG607, CG1123, CG10072, CG280, CG3648, and CG10344) belonging to five sublineages (SL607, SL17, SL280, SL37, and SL10072), with perfect concordance between paired samples from the same case. Two infants sampled the same month harbored SL607 lineages sharing 621 out of 629 cgMLST alleles, suggesting clonal spread. Kp capsule (K) loci were detected in all but one case and included potential vaccine targets KL25, KL23, and KL122. Antimicrobial resistance genes were widespread among samples, particularly those encoding resistance toward aminoglycosides, β-lactams, sulphonamides, tetracyclines, and trimethoprim. Extended-spectrum β-lactamases were identified in four cases, three of which were blaCTX-M-15. The acquired Kp siderophore yersiniabactin (lineage ybt14) was identified in both cases associated with SL607, and the acquired siderophore aerobactin (lineage iuc5) was identified in one of these, suggesting possible convergence of antimicrobial resistance and hypervirulence. The detection of Kp with extensive antimicrobial resistance causing fatal community-acquired pneumonia signals a deeply concerning epidemiologic shift from a largely nosocomial pathogen. This calls for urgent epidemiological investigations to better understand the burden, transmission dynamics, antimicrobial resistances, and potential vaccine targets for Kp in other community settings across sub-Saharan Africa.

Author Summary: Klebsiella pneumoniae is a major cause of infections and death among newborns and young children, particularly in low-income countries, where it is frequently resistant to antibiotics. While well-known as a hospital-associated pathogen, we previously showed K. pneumoniae is also a leading cause of fatal community lung infections among infants and children in Lusaka, Zambia. In this follow-on analysis, we performed deeper genetic analysis of K. pneumoniae detected from the cluster of community pneumonia deaths using lung tissue samples from seven of these children. Since traditional bacterial cultures were unavailable, we instead used a novel approach that enriched and sequenced specific regions of the K. pneumoniae genome directly from the biopsy samples without culturing bacterial isolates. We identified five different K. pneumoniae genetic subtypes, known as sublineages. Two sublineages, which came from children sampled the same month, were highly similar, suggesting clonal spread. Multiple acquired antimicrobial resistance genes were detected across all sublineages. Acquired virulence factors, which may cause more aggressive infections, were also detected in two cases. We also identified capsule types previously suggested as potential vaccine targets. This study underscores the urgent need to better understand and address the emerging burden of antibiotic-resistant K. pneumoniae pneumonia and other invasive infections among infants and children in community settings in sub-Saharan Africa.

13
Co-infections and cryptic pathogens uncovered by metatranscriptomics in New Zealand's severe acute respiratory infections

Holdsworth, N.; French, R.; Waller, S.; Jelly, L.; Oneill, M.; de Vries, I.; Dubrelle, J.; French, N.; Bloomfield, M.; Winter, D.; Huang, Q. S.; Geoghegan, J. L.

2026-03-24 genomics 10.64898/2026.03.19.712874 medRxiv
Top 0.1%
22.6%

Severe acute respiratory infections (SARI) are a leading cause of hospitalisation and mortality globally. Many SARI cases remain undiagnosed because kit-based PCR diagnostic panels are typically limited to one or a small number of known pathogens and may fail to identify low-abundance infections or novel, poorly characterised organisms. Here, we used metatranscriptomic sequencing to profile the total infectome of 300 PCR-negative SARI nasopharyngeal samples collected through sentinel hospital-based surveillance in New Zealand between 2014 and 2021. Our analysis revealed actively transcribing potential pathogens in 43% of SARI cases, comprising 10 RNA viruses, three DNA viruses, nine bacterial species and four fungal species. Notably, co-infections occurred in 26% of cases, revealing polymicrobial infections missed by routine diagnostics. Human rhinoviruses were the most frequently identified, despite not being detected by PCR, and multiple common-cold coronaviruses, human parechovirus A1 and parainfluenza virus type 4 were identified, although these were not included in the PCR screening panel. We also detected a range of bacterial and fungal species and uncovered highly expressed virulence and antimicrobial resistance genes. Infectome composition and diversity were shaped by key demographic and epidemiological factors, with the strongest effects observed for age and year of sample collection, indicating that host characteristics and temporal dynamics influence both microbial richness and community structure. These findings highlight the limitations of current diagnostic strategies and the value of metatranscriptomics for comprehensive microbial identification. Integrating such genomic approaches into both clinical and public health frameworks could improve diagnostic accuracy, enabling more sensitive detection and characterisation of potential pathogens while also strengthening surveillance and outbreak response.

14
Genomic Signatures and Prediction of Clinical Severity in Klebsiella pneumoniae infections in a Multicenter Cohort

Malaikah, M.; Alyami, R. Y.; Huang, J.; Fallatah, O. A.; Milner, M.; Zhou, G.; Hirayban, R.; Iftikhar, S.; Banzhaf, M.; Li, Y.; Senok, A.; Hala, S. M.; Batook, N.; Alsharif, D.; AlShahrani, A. s.; Alamri, A. W.; AlJohani, S. M.; Kaaki, M. M.; Alalwan, B.; Absar, M.; Ali, M. E. M.; Sadah, H. S.; Zakri, S.; Bosaeed, M.; Pain, A.; Moradigaravand, D.

2026-02-03 infectious diseases 10.64898/2026.02.02.26345332 medRxiv
Top 0.1%
22.4%

Klebsiella pneumoniae is a major causative agent of hospital-acquired infections worldwide, contributing substantially to morbidity, mortality, and healthcare burden. The emergence of strains that combine resistance to last-resort antimicrobials with hypervirulence has become a pressing public-health challenge. Despite extensive characterization of the genetic determinants of multidrug resistance and hypervirulence, the relationship between the genetic repertoire of K. pneumoniae and the clinical severity of infections remains inadequately understood. Methods: We analyzed a nationwide large-scale collection of 1,306 K. pneumoniae complex strains retrieved over seven years from five centres across the Kingdom of Saudi Arabia. Using detailed and comprehensive patient-level clinical data, we employed a range of regression analyses, genome-wide association study (GWAS) methods, and machine-learning approaches to elucidate the clinical significance of ESBL/carbapenemase-producing (ESBL/CP), hypervirulent, and convergent ESBL(+)/CP(+) hypervirulent K. pneumoniae strains. We examined clinical severity outcomes including in-hospital all-cause mortality rate, ICU admission rate and length of hospitalisation (LOS) across these K. pneumoniae types, identified genome-wide determinants linked with clinical severity and used machine learning approaches to predict clinical severity outcomes from genomic biomarkers together with clinical metadata. Results: Infections caused by convergent strains exhibited the greatest clinical severity, showing nearly double the in-hospital mortality (reaching 42% at 90 days), a 2.4-fold higher likelihood of ICU admission, and an average 150% increase in LOS compared to infections caused by susceptible and non-hypervirulent strains. Our findings indicate an additive effect of hypervirulence and multidrug resistance on disease severity. Carbapenem resistance determinants showed the strongest association with adverse outcomes, even after adjusting for the presence of other resistance and virulence genes and clinical confounder features. The GWAS analysis revealed associations of the clinical outcomes with accessory genes involved in carbohydrate metabolism and the Type VI secretion system (T6SS) machinery, metabolic-adaptation and stress-tolerance/persistence loci. Additional significant associations were identified with SNPs in ABC-transporters, cell-envelope systems, sugar transporter families and RND-family efflux systems. Machine-learning models yielded average Area Under the Curve (AUC) values of 0.78 and 0.79 for mortality and ICU admission, respectively, and exhibited strong monotonic association between observed and predicted outcomes for LOS, with an average correlation of 0.59 on unseen test data when trained using combined genomic and clinical predictors. Conclusion: This study identifies key genomic determinants that drive severe K. pneumoniae infections, with carbapenem-resistance markers emerging as the leading contributors to poor clinical outcomes. The strong predictive performance of genomic biomarkers, particularly for mortality, ICU admission, and LOS, highlights their value in enhancing diagnostic precision, improving clinical risk stratification, and informing targeted infection-prevention strategies.

15
A Life Identification Number Barcoding (LIN Code) System for Neisseria meningitidis: high resolution multi-level typing of meningococci.

Parfitt, K. M.; Jolley, K. A.; Unitt, A.; Bray, J. E.; Colles, F. M.; Harrison, O. B.; Feavers, I. M.; Maiden, M. C.

2026-03-03 microbiology 10.64898/2026.03.03.708563 medRxiv
Top 0.1%
22.2%

Neisseria meningitidis is a commensal member of the human oropharyngeal microbiota that can cause devastating invasive meningococcal disease. This genetically and antigenically diverse accidental pathogen has been a paradigm for the study of bacterial population biology. The meningococcus was the first organism for which a seven-locus multi-locus sequence typing (MLST) scheme was developed. With the addition of sequence-based characterisation of antigen genes, molecular typing has been widely employed to inform surveillance and public health interventions. Following the advent and widespread adoption of whole genome sequencing (WGS), precise delineation of variants is possible. Here, a WGS-based Life Identification Number (LIN) code typing scheme is described, providing a multi-resolution nomenclature for understanding meningococcal population diversity and molecular epidemiology. The LIN codes were developed using a set of 6,131 N. meningitidis genomes, comprising up to 200 isolates from each clonal complex (cc) previously described using MLST. Based on cluster-stability analysis and concordance with ccs, thirteen LIN thresholds described the meningococcal population at different levels of resolution. LIN codes and human-readable nicknames consistent with existing nomenclatures were assigned to the N. meningitidis genomes hosted in the PubMLST database. Published outbreaks validated the LIN thresholds, illustrating the potential of this genomic tool in public health management.

16
Environmental reservoirs of high-risk ESBL- and carbapenemase-producing E. coli and Klebsiella in maternity wards in Yaounde (Cameroon): Whole-genome sequencing and antimicrobial susceptibility studies

Bessala, G. C.; Abomo, G. D.; Ngamaleu, R.; Essiben, F.; Wheeler, N.; Buckner, M. M. C.; Kreft, J. U.; Bougnom, B. P.

2026-03-18 epidemiology 10.64898/2026.03.16.26348525 medRxiv
Top 0.1%
22.1%

Background: The hospital environment is increasingly recognized as a critical reservoir for antimicrobial-resistant (AMR) bacteria. In sub-Saharan Africa, maternity wards represent high-risk settings where environmental contamination poses a direct threat to vulnerable mothers and neonates. Despite this, there is a significant lack of data integrating phenotypic resistance with whole-genome sequencing (WGS) to understand antimicrobial resistance in these settings. This study characterized the AMR patterns and genomic features of ESBL-producing Escherichia coli and Klebsiella spp. isolated from maternity ward surfaces in Yaounde, Cameroon. Methods: A cross-sectional environmental study was conducted across four maternity wards. Isolates were identified via standard microbiological methods, and antimicrobial susceptibility testing against 13 antibiotics was performed following EUCAST 2024 guidelines. Short-read WGS was utilized to identify sequence types (STs), plasmid incompatibility groups, antibiotic resistance genes (ARGs), and virulence factors. Plasmid-ARG association networks were constructed to visualize resistance dynamics. Results: Nineteen ESBL-producing Enterobacterales were identified, comprising 15 E. coli and four Klebsiella isolates. High levels of multidrug resistance were observed against ciprofloxacin, penicillins, and third-generation cephalosporins. While the isolates remained sensitive to colistin and imipenem, alarming resistance to meropenem was detected. Genomic analysis revealed the presence of globally disseminated high-risk lineages, including E. coli ST131, ST1193, and ST410, alongside Klebsiella ST1324 and ST489. Critical resistance determinants, including ESBLs, AmpC enzymes, and carbapenemases (NDM and OXA-48-like), were frequently associated with epidemic plasmids such as IncF, IncA/C2, and IncL/M. Additionally, the isolates harboured virulence factors characteristic of extraintestinal pathogenic Enterobacterales. Conclusions: The widespread presence of high-risk carbapenemase-producing clones on maternity ward surfaces identifies the hospital environment as a significant AMR reservoir in Yaounde. These findings highlight the urgent need for reinforced infection prevention and control (IPC) measures, robust antimicrobial stewardship, and the integration of genomic surveillance to safeguard highly susceptible maternal and neonatal populations from life-threatening infections.

17
Thirty years of Achromobacter ruhlandii evolution reveal pathways to epidemic lineages

Gabrielaite, M.; Johansen, H. K.; Juozapaitis, J.; Marvig, R. L.; Dudas, G.

2026-03-25 bioinformatics 10.64898/2026.03.25.714254 medRxiv
Top 0.1%
22.1%

Background: Achromobacter spp. are emerging opportunistic pathogens, associated with chronic infections, antimicrobial resistance, and poor clinical outcomes. The Danish epidemic strain (DES) of A. ruhlandii is highly drug-resistant and adapted to the cystic fibrosis (CF) airway, yet its evolutionary history and defining genomic features remain poorly understood. Methods: We analysed genome and antibiotic susceptibility testing data for 58 longitudinally collected DES isolates sampled over 21 years at Rigshospitalet, Denmark. We combined these with 79 publicly available A. ruhlandii genomes and applied phylogenomics to infer DES emergence and transmission, and genome-wide association studies (GWAS) to identify lineage-specific and adaptive genomic features. Results: DES forms a distinct monophyletic clade within A. ruhlandii, estimated to have emerged around 1990, with no evidence of dissemination beyond Denmark. GWAS identified key lineage-defining traits, including acquisition of large mobile genetic elements, plasmid integration events, and enrichment of resistance and iron acquisition genes. In addition, we detected other epidemic A. ruhlandii lineages with evidence of long-term persistence and inter-country spread, sharing similar genetic signatures of adaptation. Conclusions: This study elucidates the genomic features associated with chronic infection and epidemic potential in A. ruhlandii. The DES lineage illustrates how extensive horizontal gene transfer, high intrinsic resistance potential, and enhanced host-adaptation traits, such as increased iron acquisition, can facilitate the emergence and persistence of successful epidemic lineages. These findings highlight shared evolutionary signatures of epidemic A. ruhlandii and underscore the need for continued genomic surveillance to detect and monitor emerging high-risk lineages in chronic infections.

18
Refining the Serine Protease Autotransporters of Enterobacteriaceae (SPATE) gene detection in Enteroaggregative Escherichia coli genomes uncovers differential SPATE distribution by phylogeny

Dada, R. A.; Afolayan, A. O.; Adewuyi, O. A.; Tytler, B. A.; Olayinka, B. O.; Thomson, N. R.; Okeke, I. N.

2026-04-16 microbiology 10.64898/2026.04.16.715897 medRxiv
Top 0.1%
22.1%

Background: Enteroaggregative Escherichia coli (EAEC) are a heterogeneous pathotype, implicated in acute and persistent diarrhoea, especially in developing countries. Serine Protease Autotransporters of Enterobacteriaceae (SPATEs) are Type V Secretory System trypsin-like proteases repeatedly reported from EAEC. This study aimed to determine SPATE-encoding gene prevalence among EAEC and their association with diarrhoea. We screened 881 EAEC genomes from four recent epidemiological studies in Nigeria for 23 SPATE-encoding genes, initially using ARIBA and the VirulenceFinder database. Results: Initial screening inflated SPATE gene content, particularly in genomes with multiple SPATEs, due to cross-detection of highly similar sequences and other artefacts. We developed and validated a refined methodology, which detected 478 of 1,156 original SPATE calls and also identified SPATE miscalls from previous datasets in the literature. The most prevalent SPATE-encoding gene in our EAEC collection was sepA, 297 (33.71%), closely followed by sat, 360 (29.74%). pic, encoding a SPATE with mucinase activity, was found in 65 (7.4%) genomes and associated with diarrhoea (p=0.00004). EAEC strains belonging to E. coli phylogroups A, B1 or C carried, on average, one SPATE gene per genome while >1 was typically detected in phylogroup B2 EAEC. Other EAEC carried few or no SPATE genes. Conclusions: Our study shows that multifunctional genome analysis tools may have to be refined for certain gene families to avoid overestimation. SPATEs are not as prevalent as previously thought but they remain common among EAEC, particularly among phylogroups A, B1, B2 and C, pointing to the possibility that they make lineage-specific contributions to disease.

19
The pQBR mercury resistance plasmids: a model set of sympatric environmental mobile genetic elements

Orr, V. T.; Harrison, E.; Rivett, D. W.; Wright, R. C. T.; Hall, J. P. J.

2026-03-27 microbiology 10.64898/2026.03.27.714766 medRxiv
Top 0.1%
21.9%

Plasmids are extrachromosomal mobile genetic elements that can facilitate rapid bacterial adaptation by transferring genes between individuals. While plasmids are known to exist in diverse habitats and encode a range of traits, most of our knowledge about plasmids comes from clinically-associated antimicrobial resistance (AMR) plasmids that have already been recruited as vectors of drug resistance and have likely been shaped by strong selection for plasmid-encoded resistance. Here, we investigated 26 plasmids from the pQBR collection -- a set of large, co-existing mercury resistance environmental plasmids isolated in Pseudomonas spp. from a field in Oxfordshire in the 1990s -- and explored the ability of pQBR plasmids to mobilise novel chromosomally-encoded traits. New whole genome sequences for 25 plasmids confirmed that these soil-isolated plasmids are generally very large (140-588 kb), constitute at least five distinct genetic groups, and have relatives in various other Pseudomonas species and habitats. Despite significant nucleotide-level divergence, Groups I (pQBR103-like, ~406 kb) and IV (pQBR57-like, ~328 kb) showed remarkable ancient similarities in synteny and gene content both with one another, and with the PInc-2 family of plasmids known to mobilise clinically significant drug resistance in Pseudomonas aeruginosa. None of the pQBR plasmids sequenced to date harboured known AMR determinants, but putative phage defence systems and metal resistances were evident. Transposable elements, including the Tn5042 mercury resistance transposon, were responsible for significant structural variation within plasmid groups, consistent with a predominant role of transposons in rapidly remodelling plasmids. To experimentally test the ability of pQBR plasmids to spread new traits, we developed a novel transposon mobilisation assay which showed that certain Group IV pQBR plasmids were especially effective at acquiring the chromosomally-encoded transposon Tn6291, and that this mobilisation was likely due to specific plasmid factors rather than generic conjugation rate. Our work presents a tractable set of sequenced plasmids suitable for exploring the evolution and dynamics of gene acquisition by pre-AMR plasmids, and provides a key case study highlighting the pervasive interplay between plasmids and transposable elements that can drive microbial genome evolution.

Repositories: github.com/jpjh/PQBR_PLASMIDS

Impact statement: Plasmids can drive microbial evolution by acting as vectors for horizontal gene transfer. Because of their central role in disseminating antimicrobial resistance (AMR), plasmids are mainly explored as vehicles for AMR traits, meaning that our knowledge of the diversity and evolutionary dynamics of non-AMR plasmids is more limited. Here, we explore sequences from a set of mercury resistance plasmids isolated in Pseudomonas spp. from pristine agricultural land that lack AMR determinants. By providing new whole genome sequencing analyses we expand the set of sequenced pQBR plasmids to 26, finding globally dispersed relatives from clinical, environmental, and industrial settings, and identifying an ancient plasmid backbone shared amongst divergent modern environmental and clinical AMR plasmids. We experimentally verify the role of pQBR plasmids in readily mobilising chromosomal traits using a novel transposon mobilisation assay, which suggests that specific plasmid-transposon interactions may drive trait spread. Overall, our work expands our understanding of the role of environmental plasmids in mobilising and disseminating adaptive traits.

20
Genomic characterization of Escherichia coli and Enterobacter hormaechei clinical isolates from a tertiary healthcare facility in Kenya

Musundi, S.; Kimani, R. W.; Waweru, H. K.; Wakaba, P.; Mbogo, D.; Essuman, S.; Onyambu, F.; Kanoi, B. N.; Gitaka, J.

2026-04-15 bioinformatics 10.64898/2026.04.13.718279 medRxiv
Top 0.1%
19.3%

Extended-spectrum beta-lactamase-producing Enterobacterales such as Escherichia coli and Enterobacter hormaechei represent a growing public health challenge in clinical settings, particularly in low- and middle-income countries, due to the escalating threat of antimicrobial resistance (AMR). In this study, we aimed to identify the antibiotic resistance genes present in E. coli (n=4) and E. hormaechei (n=3) clinical isolates. Multidrug-resistant phenotypes were confirmed using disc diffusion assays against 20 antibiotics. Whole-genome sequencing of resistant isolates was performed using Oxford Nanopore Technologies. Genome assembly and analysis revealed high-risk clones, including sequence type (ST) 1193 in E. coli and ST78 in E. hormaechei. All E. coli isolates harbored the blaCTX-M gene in their chromosomes along with point mutations conferring resistance to fluoroquinolones, while E. hormaechei isolates encoded blaACT in their chromosomes. Additionally, both species carried plasmids with multiple antibiotic resistance genes, including blaOXA and blaTEM, co-located with metal resistance operons, indicating the potential for horizontal gene transfer. BLAST analysis revealed high sequence similarity between the plasmids identified in clinical isolates and those previously recovered from environmental sources, highlighting the role of environmental reservoirs in AMR dissemination. Notably, no carbapenem resistance genes were detected in any isolate. These findings underscore the growing threat posed by multidrug-resistant Enterobacterales in clinical settings and emphasize the urgent need for strengthened infection prevention and control measures to mitigate AMR spread.